---
title: Installing on Kubernetes
hide_title: true
sidebar_position: 3
version: 1
---

## Installing on Kubernetes

Running Superset on Kubernetes is supported with the provided [Helm](https://helm.sh/) chart found in the official [Superset helm repository](https://apache.github.io/superset/index.yaml).

### Prerequisites

- A Kubernetes cluster
- Helm installed

### Running

1. Add the Superset helm repository

```sh
helm repo add superset https://apache.github.io/superset
"superset" has been added to your repositories
```

2. View charts in repo

```sh
helm search repo superset
NAME                    CHART VERSION   APP VERSION     DESCRIPTION
superset/superset       0.1.1           1.0             Apache Superset is a modern, enterprise-ready b...
```

3. Configure your setting overrides

Just like any typical Helm chart, you'll need to craft a `values.yaml` file that would define/override any of the values exposed into the default [values.yaml](https://github.com/apache/superset/tree/master/helm/superset/values.yaml), or from any of the dependent charts it depends on:

- [bitnami/redis](https://artifacthub.io/packages/helm/bitnami/redis)
- [bitnami/postgresql](https://artifacthub.io/packages/helm/bitnami/postgresql)

More info down below on some important overrides you might need.

4. Install and run

```sh
helm upgrade --install --values my-values.yaml superset superset/superset
```

You should see various pods popping up, such as:

```sh
kubectl get pods
NAME                                    READY   STATUS      RESTARTS   AGE
superset-celerybeat-7cdcc9575f-k6xmc    1/1     Running     0          119s
superset-f5c9c667-dw9lp                 1/1     Running     0          4m7s
superset-f5c9c667-fk8bk                 1/1     Running     0          4m11s
superset-init-db-zlm9z                  0/1     Completed   0          111s
superset-postgresql-0                   1/1     Running     0          6d20h
superset-redis-master-0                 1/1     Running     0          6d20h
superset-worker-75b48bbcc-jmmjr         1/1     Running     0          4m8s
superset-worker-75b48bbcc-qrq49         1/1     Running     0          4m12s
```

The exact list will depend on some of your specific configuration overrides but you should generally expect:

- N `superset-xxxx-yyyy` and `superset-worker-xxxx-yyyy` pods (depending on your `supersetNode.replicaCount` and `supersetWorker.replicaCount` values)
- 1 `superset-postgresql-0` depending on your postgres settings
- 1 `superset-redis-master-0` depending on your redis settings
- 1 `superset-celerybeat-xxxx-yyyy` pod if you have `supersetCeleryBeat.enabled = true` in your values overrides

1. Access it

The chart will publish appropriate services to expose the Superset UI internally within your k8s cluster. To access it externally you will have to either:

- Configure the Service as a `LoadBalancer` or `NodePort`
- Set up an `Ingress` for it - the chart includes a definition, but will need to be tuned to your needs (hostname, tls, annotations etc...)
- Run `kubectl port-forward superset-xxxx-yyyy :8088` to directly tunnel one pod's port into your localhost

Depending how you configured external access, the URL will vary. Once you've identified the appropriate URL you can log in with:

- user: `admin`
- password: `admin`

### Important settings

#### Security settings

Default security settings and passwords are included but you **SHOULD** override those with your own, in particular:

```yaml
postgresql:
  postgresqlPassword: superset
```

Make sure, you set a unique strong complex alphanumeric string for your SECRET_KEY and use a tool to help you generate
a sufficiently random sequence.

- To generate a good key you can run, `openssl rand -base64 42`

```yaml
configOverrides:
  secret: |
    SECRET_KEY = 'YOUR_OWN_RANDOM_GENERATED_SECRET_KEY'
```

If you want to change the previous secret key then you should rotate the keys.
Default secret key for kubernetes deployment is `thisISaSECRET_1234`

```yaml
configOverrides:
  my_override: |
    PREVIOUS_SECRET_KEY = 'YOUR_PREVIOUS_SECRET_KEY'
    SECRET_KEY = 'YOUR_OWN_RANDOM_GENERATED_SECRET_KEY'
init:
  command:
    - /bin/sh
    - -c
    - |
      . {{ .Values.configMountPath }}/superset_bootstrap.sh
      superset re-encrypt-secrets
      . {{ .Values.configMountPath }}/superset_init.sh
```

:::note
Superset uses [Scarf Gateway](https://about.scarf.sh/scarf-gateway) to collect telmetry data to better understand and support the need for patch versions of Sueprset. Scarf purges PII and provides aggregated statistics. Superset users can easily opt out of analytics in various ways documented [here](https://docs.scarf.sh/gateway/#do-not-track). However, if you wish to opt-out of this in your Helm-based installation, you can simply edit your `helm/superset/values.yaml` file and remove `apachesuperset.docker.scarf.sh/` from the `repository` field, so that it's simply pulling `apache/superset`
:::

#### Dependencies

Install additional packages and do any other bootstrap configuration in the bootstrap script.
For production clusters it's recommended to build own image with this step done in CI.

:::note

Superset requires a Python DB-API database driver and a SQLAlchemy
dialect to be installed for each datastore you want to connect to.

See [Install Database Drivers](/docs/databases/installing-database-drivers) for more information

:::

The following example installs the Big Query and Elasticsearch database drivers so that you can
connect to those datasources in your Superset installation:

```yaml
bootstrapScript: |
  #!/bin/bash
  pip install psycopg2==2.9.6 \
    sqlalchemy-bigquery==1.6.1 \
    elasticsearch-dbapi==0.2.5 &&\
  if [ ! -f ~/bootstrap ]; then echo "Running Superset with uid {{ .Values.runAsUser }}" > ~/bootstrap; fi
```

#### superset_config.py

The default `superset_config.py` is fairly minimal and you will very likely need to extend it. This is done by specifying one or more key/value entries in `configOverrides`, e.g.:

```yaml
configOverrides:
  my_override: |
    # This will make sure the redirect_uri is properly computed, even with SSL offloading
    ENABLE_PROXY_FIX = True
    FEATURE_FLAGS = {
        "DYNAMIC_PLUGINS": True
    }
```

Those will be evaluated as Helm templates and therefore will be able to reference other `values.yaml` variables e.g. `{{ .Values.ingress.hosts[0] }}` will resolve to your ingress external domain.

The entire `superset_config.py` will be installed as a secret, so it is safe to pass sensitive parameters directly... however it might be more readable to use secret env variables for that.

Full python files can be provided by running `helm upgrade --install --values my-values.yaml --set-file configOverrides.oauth=set_oauth.py`

#### Environment Variables

Those can be passed as key/values either with `extraEnv` or `extraSecretEnv` if they're sensitive. They can then be referenced from `superset_config.py` using e.g. `os.environ.get("VAR")`.

```yaml
extraEnv:
  SMTP_HOST: smtp.gmail.com
  SMTP_USER: user@gmail.com
  SMTP_PORT: "587"
  SMTP_MAIL_FROM: user@gmail.com

extraSecretEnv:
  SMTP_PASSWORD: xxxx

configOverrides:
  smtp: |
    import ast
    SMTP_HOST = os.getenv("SMTP_HOST","localhost")
    SMTP_STARTTLS = ast.literal_eval(os.getenv("SMTP_STARTTLS", "True"))
    SMTP_SSL = ast.literal_eval(os.getenv("SMTP_SSL", "False"))
    SMTP_USER = os.getenv("SMTP_USER","superset")
    SMTP_PORT = os.getenv("SMTP_PORT",25)
    SMTP_PASSWORD = os.getenv("SMTP_PASSWORD","superset")
```

#### System packages

If new system packages are required, they can be installed before application startup by overriding the container's `command`, e.g.:

```yaml
supersetWorker:
  command:
    - /bin/sh
    - -c
    - |
      apt update
      apt install -y somepackage
      apt autoremove -yqq --purge
      apt clean

      # Run celery worker
      . {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker
```

#### Data sources

Data source definitions can be automatically declared by providing key/value yaml definitions in `extraConfigs`:

```yaml
extraConfigs:
  import_datasources.yaml: |
    databases:
    - allow_file_upload: true
      allow_ctas: true
      allow_cvas: true
      database_name: example-db
      extra: "{\r\n    \"metadata_params\": {},\r\n    \"engine_params\": {},\r\n    \"\
        metadata_cache_timeout\": {},\r\n    \"schemas_allowed_for_file_upload\": []\r\n\
        }"
      sqlalchemy_uri: example://example-db.local
      tables: []
```

Those will also be mounted as secrets and can include sensitive parameters.

### Configuration Examples

#### Setting up OAuth

:::note

OAuth setup requires that the [authlib](https://authlib.org/) Python library is installed. This can
be done using `pip` by updating the `bootstrapScript`. See the [Dependencies](#dependencies) section
for more information.

:::

```yaml
extraEnv:
  AUTH_DOMAIN: example.com

extraSecretEnv:
  GOOGLE_KEY: xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com
  GOOGLE_SECRET: xxxxxxxxxxxxxxxxxxxxxxxx

configOverrides:
  enable_oauth: |
    # This will make sure the redirect_uri is properly computed, even with SSL offloading
    ENABLE_PROXY_FIX = True

    from flask_appbuilder.security.manager import AUTH_OAUTH
    AUTH_TYPE = AUTH_OAUTH
    OAUTH_PROVIDERS = [
        {
            "name": "google",
            "icon": "fa-google",
            "token_key": "access_token",
            "remote_app": {
                "client_id": os.getenv("GOOGLE_KEY"),
                "client_secret": os.getenv("GOOGLE_SECRET"),
                "api_base_url": "https://www.googleapis.com/oauth2/v2/",
                "client_kwargs": {"scope": "email profile"},
                "request_token_url": None,
                "access_token_url": "https://accounts.google.com/o/oauth2/token",
                "authorize_url": "https://accounts.google.com/o/oauth2/auth",
                "authorize_params": {"hd": os.getenv("AUTH_DOMAIN", "")}
            },
        }
    ]

    # Map Authlib roles to superset roles
    AUTH_ROLE_ADMIN = 'Admin'
    AUTH_ROLE_PUBLIC = 'Public'

    # Will allow user self registration, allowing to create Flask users from Authorized User
    AUTH_USER_REGISTRATION = True

    # The default user self registration role
    AUTH_USER_REGISTRATION_ROLE = "Admin"
```

#### Enable Alerts and Reports

For this, as per the [Alerts and Reports doc](/docs/installation/email-reports), you will need to:

##### Install a supported webdriver in the Celery worker

This is done either by using a custom image that has the webdriver pre-installed, or installing at startup time by overriding the `command`. Here's a working example for `chromedriver`:

```yaml
supersetWorker:
  command:
    - /bin/sh
    - -c
    - |
      # Install chrome webdriver
      # See https://github.com/apache/superset/blob/4fa3b6c7185629b87c27fc2c0e5435d458f7b73d/docs/src/pages/docs/installation/email_reports.mdx
      apt update
      wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
      apt install -y --no-install-recommends ./google-chrome-stable_current_amd64.deb
      wget https://chromedriver.storage.googleapis.com/88.0.4324.96/chromedriver_linux64.zip
      unzip chromedriver_linux64.zip
      chmod +x chromedriver
      mv chromedriver /usr/bin
      apt autoremove -yqq --purge
      apt clean
      rm -f google-chrome-stable_current_amd64.deb chromedriver_linux64.zip

      # Run
      . {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker
```

##### Run the Celery beat

This pod will trigger the scheduled tasks configured in the alerts and reports UI section:

```yaml
supersetCeleryBeat:
  enabled: true
```

##### Configure the appropriate Celery jobs and SMTP/Slack settings

```yaml
extraEnv:
  SMTP_HOST: smtp.gmail.com
  SMTP_USER: user@gmail.com
  SMTP_PORT: "587"
  SMTP_MAIL_FROM: user@gmail.com

extraSecretEnv:
  SLACK_API_TOKEN: xoxb-xxxx-yyyy
  SMTP_PASSWORD: xxxx-yyyy

configOverrides:
  feature_flags: |
    import ast

    FEATURE_FLAGS = {
        "ALERT_REPORTS": True
    }

    SMTP_HOST = os.getenv("SMTP_HOST","localhost")
    SMTP_STARTTLS = ast.literal_eval(os.getenv("SMTP_STARTTLS", "True"))
    SMTP_SSL = ast.literal_eval(os.getenv("SMTP_SSL", "False"))
    SMTP_USER = os.getenv("SMTP_USER","superset")
    SMTP_PORT = os.getenv("SMTP_PORT",25)
    SMTP_PASSWORD = os.getenv("SMTP_PASSWORD","superset")
    SMTP_MAIL_FROM = os.getenv("SMTP_MAIL_FROM","superset@superset.com")

    SLACK_API_TOKEN = os.getenv("SLACK_API_TOKEN",None)
  celery_conf: |
    from celery.schedules import crontab

    class CeleryConfig(object):
      broker_url = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      imports = ('superset.sql_lab', "superset.tasks", "superset.tasks.thumbnails", )
      result_backend = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      task_annotations = {
          'sql_lab.get_sql_results': {
              'rate_limit': '100/s',
          },
          'email_reports.send': {
              'rate_limit': '1/s',
              'time_limit': 600,
              'soft_time_limit': 600,
              'ignore_result': True,
          },
      }
      beat_schedule = {
          'reports.scheduler': {
              'task': 'reports.scheduler',
              'schedule': crontab(minute='*', hour='*'),
          },
          'reports.prune_log': {
              'task': 'reports.prune_log',
              'schedule': crontab(minute=0, hour=0),
          },
          'cache-warmup-hourly': {
              'task': 'cache-warmup',
              'schedule': crontab(minute='*/30', hour='*'),
              'kwargs': {
                  'strategy_name': 'top_n_dashboards',
                  'top_n': 10,
                  'since': '7 days ago',
              },
          }
      }

    CELERY_CONFIG = CeleryConfig
  reports: |
    EMAIL_PAGE_RENDER_WAIT = 60
    WEBDRIVER_BASEURL = "http://{{ template "superset.fullname" . }}:{{ .Values.service.port }}/"
    WEBDRIVER_BASEURL_USER_FRIENDLY = "https://www.example.com/"
    WEBDRIVER_TYPE= "chrome"
    WEBDRIVER_OPTION_ARGS = [
        "--force-device-scale-factor=2.0",
        "--high-dpi-support=2.0",
        "--headless",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        # This is required because our process runs as root (in order to install pip packages)
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-extensions",
    ]
```
